Departures from Tree Structures in Discourse: Shared Arguments in the Penn Discourse Treebank
Abstract
The term discourse structure denotes any structure of a text above the level of the sentence. Trees have often been posited as a good abstraction when discourse is taken to have a hierarchical structure (Mann and Thompson 1987; Webber et al. 2003; Marcu 2000; Egg and Redeker 2008). Nevertheless, researchers have periodically commented on the need to move from the strict single-parent hierarchy of trees to structures with shared daughters, a move that introduces multiple inheritance and is therefore a problem for tree representations. This study follows up on the observation in Lee et al. (2006) about the relative ubiquity of shared structures in the Penn Discourse Treebank, or PDTB (Prasad et al. 2008; PDTB-Group 2008), a recently released corpus which annotates discourse relations and their arguments. We limit our investigation here to cases where the shared discourse structure is a syntactically subordinate clause introduced by a subordinating conjunction (e.g. because, although, when). We examine annotations in the PDTB where the subordinate clause has been taken to be an argument of both the relation associated with the subordinating conjunction and another relation expressed in the immediately subsequent discourse. We ask what such annotations imply about the link between syntactic subordination and discourse subordination. Our argument is that while syntactic subordination may often correlate with discourse subordination, there are interesting exceptions that might better be captured as discourse coordination. We provide a systematic characterization of these exceptions by appealing to well-motivated discourse factors, and discuss their implications for tree structures.
Similar resources
Annotation And Data Mining Of The Penn Discourse TreeBank
The Penn Discourse TreeBank (PDTB) is a new resource built on top of the Penn Wall Street Journal corpus, in which discourse connectives are annotated along with their arguments. Its use of standoff annotation allows integration with a stand-off version of the Penn TreeBank (syntactic structure) and PropBank (verbs and their arguments), which adds value for both linguistic discovery and discour...
Experiments on Sense Annotations and Sense Disambiguation of Discourse Connectives
Discourse connectives can be analyzed as discourse level predicates which project predicate-argument structure on a par with verbs at the sentence level. The Penn Discourse Treebank (PDTB) reflects this view in its design providing annotation of the discourse connectives and their arguments. Like verbs, discourse connectives have multiple senses. We present a set of manual sense annotation stud...
Attribution And The (Non-)Alignment Of Syntactic And Discourse Arguments Of Connectives
The annotations of the Penn Discourse Treebank (PDTB) include (1) discourse connectives and their arguments, and (2) attribution of each argument of each connective and of the relation it denotes. Because the PDTB covers the same text as the Penn TreeBank WSJ corpus, syntactic and discourse annotation can be compared. This has revealed significant differences between syntactic structure and dis...
Recognizing Implicit Discourse Relations in the Penn Discourse Treebank
We present an implicit discourse relation classifier in the Penn Discourse Treebank (PDTB). Our classifier considers the context of the two arguments, word pair information, as well as the arguments’ internal constituent and dependency parses. Our results on the PDTB yield a significant 14.1% improvement over the baseline. In our error analysis, we discuss four challenges in recognizing implic...
Searching in the Penn Discourse Treebank Using the PML-Tree Query
The PML-Tree Query is a general, powerful and user-friendly system for querying richly linguistically annotated treebanks. The present paper shows how the PML-Tree Query can be used for searching for discourse relations in the Penn Discourse Treebank 2.0 mapped onto the syntactic annotation of the Penn Treebank.